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Abstract — Unmanned  vehicles  are  emerging  as  an  attractive 
tool  for  persistent  monitoring  tasks  of  a  given  area,  but 
need  automated  planning  capabilities  for  effective  unattended 
deployment.  Such  an  automated  planner  needs  to  generate 
collision-free  coverage  paths  by  steering  waypoints  to  locations 
that  both  minimize  the  path  length  and  maximize  the  amount  of 
information  gathered  along  the  path.  The  approach  presented 
in  this  paper  significantly  extends  prior  work  and  handles 
motion  uncertainty  of  an  unmanned  vehicle  and  the  presence  of 
obstacles  by  using  a  Markov  Decision  Process  based  approach 
to  generate  collision-free  paths.  Simulation  results  show  that  the 
proposed  approach  is  robust  to  significant  motion  uncertainties 
and  reduces  the  probability  of  collision  with  obstacles  in  the 
environment. 

I.  Introduction 

There  are  many  applications  that  need  persistent  monitor¬ 
ing  of  a  given  area,  requiring  repeated  travel  over  the  area  to 
gather  new  information.  Unmanned  vehicles  are  well-suited 
for  performing  such  tasks,  but  require  automated  planning. 
Moreover,  certain  locations  in  the  area  are  designated  as  key 
locations  and  hence  are  more  valuable  from  the  information 
gathering  perspective.  Surveillance  platforms  in  persistent 
monitoring  tasks  vary  from  aerial  vehicles  to  surface  vehicles 
(e.g.,  boats)  depending  upon  the  application. 

Consider  the  use  of  an  unmanned  surface  vehicle  (USV) 
conducting  harbor  patrols  to  detect  intruders.  It  is  reasonable 
to  assume  that  possible  intruders  will  enter  the  harbor  (Fig. 
1(a))  from  certain  locations  such  as  harbor  entrances  and 
shipping  channels.  This  suggests  the  use  of  an  “information 
value  map”  (see  Fig.  1(b))  that  signifies  how  some  regions 
are  more  dynamic  or  interesting  and  should  be  observed  more 
often.  Though  the  underlying  technical  approach  developed 
in  this  paper  is  intended  for  application  to  USV  harbor 
patrolling,  it  is  applicable  to  many  different  domains. 

The  problem  of  traveling  over  an  area  and  gathering 
information  can  be  viewed  as  a  coverage  planning  problem 
[1].  There  are  many  practical  and  theoretical  challenges  in 
solving  the  coverage  planning  problem  in  the  context  of 
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Fig.  1.  (a)  An  example  of  a  harbor  patrol  environment  with  multiple  entry 

points  for  intruders  (A  harbor  in  Hollywood,  FL;  source:  Map  data  ©2013 
Google),  (b)  Obstacle  regions  (black)  and  the  “information  value”  map 


regions  of  varying  information  value.  It  is  possible  for  static 
obstacles  to  exist  around  the  harbor,  such  as  shoals.  Further, 
windy  conditions  or  swift  currents  can  contribute  significant 
uncertainty  to  the  USV’s  location  and  motion,  compounding 
the  problem.  This  suggests  using  a  physics-aware  planner 
that  is  capable  of  planning  under  motion  uncertainty  while 
avoiding  obstacles  in  the  environment. 

We  begin  with  a  short  review  of  existing  techniques  for 
informative  path  planning  for  coverage  planning  problems. 
Branch  and  bound  search  was  used  in  [2]  to  maximize  the 
“informativeness”  of  a  plan.  In  [3],  Fisher  information  matri¬ 
ces,  combined  with  rapidly-exploring  random  trees  (RRTs), 
was  used  for  information-rich  path  planning. 

Persistent  sensing  approaches  [4]  model  the  information 
uncertainty  in  the  environment  as  a  field  defined  over  a  set 
of  locations  and  assume  that  the  field  increases  linearly  at 
locations  beyond  the  sensing  range  of  the  robot  and  decreases 
linearly  at  locations  within  the  robot’s  range.  Control-based 
approaches  involving  locational  optimization  include  [5]- 
[7].  There  also  exist  information-theoretic  approaches  that 
specify  which  areas  are  of  interest  and  attempt  to  maximize 
the  informativeness  of  a  plan.  In  [8],  agents  followed  the 
gradient  of  mutual  information  to  minimize  total  entropy.  In 
[9],  a  data  structure  called  a  probabilistic  quad  tree  was  pro¬ 
posed  and  used  to  provide  a  tree  structure  to  the  total  entropy 
in  the  environment  during  target  search.  Other  examples  of 
information-theoretic  coverage  planning  work  include  [10], 
[11],  Another  approach  to  solve  the  problem  of  interest  is 
direct  planning  using  graph  search.  Conceptually,  one  could 
pick  areas  of  interest  in  the  environment  and  find  the  optimal 
path  connecting  fixed  nodes,  which  is  equivalent  to  solving 
the  Traveling  Salesman  Problem  (TSP).  While  TSP  is  an 
NP -complete  problem,  many  approximate  solutions  exist  and 
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(c)  Iteration  431:  end  of  obstacle  (d)  Proposed  modification, 
free  dynamics. 

Fig.  2.  An  example  simulation  of  the  proposed  algorithm  for  generating 
collision-tree  informative  paths  under  motion  uncertainty.  The  background 
colors  indicate  the  sensor  function,  connected  black  “x’s”  the  waypoints, 
and  the  black  cells  are  (impassable)  obstacles. 


have  been  used  in  path  planning  [12], 

After  analyzing  direct  planning  approaches,  we  noticed 
several  challenges  associated  with  them.  The  nodes  need 
not  necessarily  be  fixed  in  place,  which  is  a  common  as¬ 
sumption  for  search  problems;  new  information  coming  from 
sensors  could  update  the  node  locations  as  areas  become 
no  longer  interesting  while  persistently  monitoring  an  area. 
Alternatively,  the  traversal  cost  to  reach  a  node  could  be 
discovered  to  be  prohibitive,  yet  a  nearby  location  could  be 
reached  to  gather  sufficient  information  while  maintaining 
lower  path  traversal  cost.  All  these  factors  point  to  challenges 
in  selecting  fixed  locations  to  visit  without  considering  the 
path  length/information  gain  trade-off. 

We  believe  that  a  more  continuous,  online  update  policy 
that  also  allows  the  locations  of  nodes  to  change  in  Euclidean 
space  is  preferable  in  our  application.  In  fact,  this  approach 
was  used  in  the  work  by  Soltero  et  al.  [5]  and  our  work  in 
this  paper  is  inspired  by  their  approach.  We  extend  their 
approach  by  developing  a  waypoint  feedback  policy  that 
morphs  the  informative  coverage  plan  to:  (1)  minimize  ex¬ 
pected  path  traversal  cost,  accounting  for  motion  uncertainty 
and  presence  of  obstacles,  and  (2)  maximize  information 
gathered  along  the  path.  We  formulate  the  problem  using 
a  Markov  Decision  Process  (MDP)  based  approach  (See 
Section  II-A,  [13],  [14]).  Paths  generated  using  this  approach 
can  be  executed  on  autonomous  unmanned  surface  vehicles 
described  in  our  prior  work  [15], 


II.  Problem  Formulation  &  Approach 

In  this  section  we  describe  an  approach  for  computing 
an  informative  coverage  path  for  a  vehicle  operating  in  an 
environment  with  static  obstacles  and  motion  uncertainty.  Let 
x  =  {(*,y)}  C  R2  be  the  continuous  state  space.  Let  Xd  = 
{(ij,  y,)}  c  A  be  the  finite,  discrete  regular  grid.  Let  £  = 
(4>,  O)  be  the  environment,  where  <j>:  X  — y  R+  is  the  sensor 
function  that  assigns  the  density  of  information  in  each  state, 
and  O  c  Xd  is  the  set  of  states  occupied  by  obstacles.  Let 
r  =  c  X  be  the  informative  coverage  path  that 

consists  of  a  sequence  of  N  waypoints.  Finally,  since  this  is 
a  persistent  monitoring  task,  we  assume  that  r  is  a  closed 
path,  where  i  =  N  +  1  is  identified  with  i  =  1. 

The  task  is  to  update  r  by  steering  waypoints  to  locations 
that  both  minimize  path  costs  (e.g.,  sum  of  path  lengths 
between  neighboring  waypoints)  and  maximize  the  amount 
of  information  gathered  along  the  path.  These  competing 
objectives  are  formulated  as  a  single-objective  optimization 
problem  in  Section  II-B  and  is  similar  to  the  approach 
proposed  in  [5],  However,  we  extend  the  previously  used 
notion  of  minimizing  path  length  to  factor  motion  uncertainty 
and  the  presence  of  obstacles  by  minimizing  expected  path 
traversal  cost.  Generating  a  single  path  is  not  sufficient  to 
account  for  motion  uncertainty.  Since  the  vehicle  can  deviate 
from  a  planned  path  during  execution,  use  of  a  feedback  plan 
which  consists  of  a  collection  of  state  action  pairs  ( Xi,Ui ) 
is  needed  for  each  xl  e  Xd. 

We  have  formulated  a  Markov  Decision  Process  (MDP) 
problem  for  each  waypoint  so  the  vehicle  generates  collision 
free  feedback  plans  to  traverse  from  waypoint  to  waypoint 
and  can  predict  expected  path  traversal  costs  (see  Section  II- 
A).  The  MDP  problem  is  solved  using  the  probabilistic  value 
iteration  [13]  and  its  solution  (value  function  and  feedback 
plan)  is  directly  integrated  into  the  coverage  plan  using  the 
proposed  algorithm  outlined  in  Section  II-C.  Analysis  in  Sec¬ 
tion  IV  shows  that  this  modification  to  the  waypoint  steering 
algorithm  reduces  the  probability  of  collision  while  executing 
the  coverage  plan  compared  to  previous  informative  coverage 
planning  algorithms. 

A.  Markov  Decision  Process  Formulation 

A  Markov  Decision  Process  (MDP)  is  defined  as  the 
following  tuple  MDP  =  (Xd,  Ud,  0,  L,  Xgoai)  [13].  The 
state  space  of  the  MDP  is  the  regular  grid  Xd  C  R2, 
where  Xgoai  c  Xd  is  a  collection  of  goal  states.  Let  the 
vehicle’s  action  space  be  Ud  =  {—1,0,1}  x  {—1,0,1}, 
which  are  the  free  choices  the  vehicle  can  make  when 
moving  in  Xd.  However,  we  assume  that  there  is  uncertainty 
in  the  vehicle’s  state  transitions.  Given  time  index  k,  nature’s 
actions  on  the  system  state  Ok  are  defined  such  that  the 
vehicle’s  state  transitions  satisfy  zjb+i  =  xu  +  Ok-  Here,  Ok 
is  a  random  variable  that  is  drawn  from  the  sample  space 
©fc  =  Q(xk,Uk)  (see  Fig.  3).  The  probabilities  of  nature’s 
actions  on  the  system  state  are  defined  as  Pr[0fc|xfc,  «k]. 
L  =  L(x,u,0,O)  is  the  cost  functional  for  transitioning 
between  states,  which  also  encodes  the  position  of  obstacles 


xt  0*0*  »*)  Xk 


Fig.  3.  A  graphical  depiction  of  possible  state  transitions  from  xk  to  1 
conditioned  on  the  control  action  uk.  Each  9k  G  Q(xk,  uk)  has  an  assigned 
probability.  Note  that  u  and  0  need  not  reside  within  the  same  space. 


by  assigning  infinite  cost  to  movements  that  cause  the  vehicle 
to  collide  with  obstacles. 

Probabilistic  Backwards  Value  Iteration  (PBVI)  is  used  to 
solve  the  MDP  to  generate  the  value  function  (expected  path 
traversal  cost-to-go)  $  :  Xd  -4  K+  and  optimal  feedback 
plan  7r  :  Xj  -4  Ud  [13].  In  short,  denote  PBVI  :  MDP 
(4>.  7r),  where  $  satisfies  the  Bellman  equation  (1). 


3>(x)  =  minEe[L(i,  u,  0, 0 )  +  $(x  +  0(x,  u))]  (1) 

uEU 

In  (1),  4>(x)  is  minimized  if  the  feedback  plan  u  =  ir(x)  is 
executed.  Assuming  L  is  positive  definite  and  deterministic 
(independent  of  nature’s  action,  i.e.,  L  =  L(x,n(x),G)  > 
0),  it  holds  that  4>(xfe)  =  L(xk,w(xk))  +  E9k[4>(xjt  +  0fc)], 
or  more  importantly 

E0fc[$(xfc+i)]  <  $>(xfc)  (2) 

B.  Objective  Function  and  Waypoint  Location  Optimization 

Let  H  be  the  single  objective  function  defined  in  (3).  The 
goal  is  to  compute  a  waypoint-steering  feedback  policy  ft  = 
ut  that  minimizes  H. 


n  =  f  (3> 

+  E^[(^-I(p.))2  +  (^+1(p.))2i 


»=i 


Here,  Ws  and  Wg  reflect  the  tradeoff  between  the  compet¬ 
ing  objectives  of  steering  waypoints  to  informative  regions 
vs.  reducing  path  length.  Further,  V,  is  the  Voronoi  partition 
of  waypoint  pi.  Voronoi  partitions  have  been  frequently  used 
in  similar  locational  optimization  problems  [6],  [16],  [17].  (3) 
is  similar  to  the  Lyapunov-like  function  candidate  found  in 
[5],  excluding  the  adaptation  parameter. 

The  best  physical  interpretation  of  Li  is  that  it  defines  a 
spring-like  potential  energy  so  that  virtual  springs  connect 
neighboring  waypoints.  Similarly,  virtual  springs  connect 
waypoints  to  the  informative  center  of  mass  of  its  respective 
Voronoi  partition.  Standard  gradient  descent  technique  can¬ 
not  be  used  to  locally  minimize  (3)  due  to  the  construction 
of  3>.  Hence,  we  propose  using  the  following  continuous 
feedback  policy  (4): 


P,  =  ui  =  ^[WaMViei  +  (4) 

T  (Pi))^p,+i (Pi)] 

Here,  My,  =  fv  Ws0(q)dq  is  the  mass  of  the  Voronoi 
partition  Vi  using  the  sensor  function  <f>  as  the  density 
function,  e*  =  Cft  — p,  is  the  vector  difference  of  the  Voronoi 
partition  centroid  and  the  waypoint  pt.  Cyt  =  where 
Lvi  =  fv  lVsq<fi(q)dq  is  the  first  moment.  The  integrals 
defined  over  the  Voronoi  partitions  (e.g.,  Myi  and  Cyj) 
are  well  approximated  by  the  Riemann  sum  of  points  on 
the  regular  grid  Xd-  The  Voronoi  partitions  were  therefore 
approximated  by  calculating  the  nearest  waypoint  (i.e.,  the 
nearest  neighbor)  of  each  x  e  Xd  using  the  k-nearest- 
neighbor  (KNN)  algorithm.  Ki  is  a  positive  gain  constant. 

Also  note  that  ft  =  Myi  +  2 Wg  is  a  normalization 
parameter,  and  hPj(pi)  is  the  descent  direction  of  <bPj.  (4) 
has  the  property  that  if  hPj(pi)  =  V 4>P)  (p, )  then  (4)  is 
equivalent  to  normalized  gradient  descent  of  H  (3). 

When  computing  ^-(4*P]  (p,))2  in  (4),  it  is  assumed 
that  the  subscript  pg  is  fixed  and  hence  (pt))2  = 

(2 $pj(pi))V$Pj(pi).  Also,  we  assume  ^-($w(pi))  =  0 
which  explains  the  use  of  two  terms  in  (3).  The  gradient 
V4>p5  is  not  well  defined  since  the  domain  of  $  is  Xd- 
Instead  of  using  gradient  descent,  we  will  define  a  descent 
direction  h  by  averaging  all  possible  outcomes  9  when 
executing  the  feedback  plan.  It  is  reasonable  to  use  either 
the  mode  (5)  or  expected  value  (6): 

h(x)  :=  hmode{x)  =  arg  max Pr[0|x,  7r(x)]  (5) 

0ee 

h(x)  :=  hmean(x)  =  E[0|s,  7r(x)]  (6) 

-  ? 

However,  verifying  the  desirable  identity  4>Pj  (pt  +  rjh)  < 
4>Pj  (p^  with  r]  G  (0, 1]  is  a  difficult  task  since  $  is 
not  convex  with  obstacles  present.  For  example,  we  can¬ 
not  invoke  Jensen’s  inequality  to  show  that,  for  example, 

? 

$(E[x  +  0])  <  E[4>(x  +  0)]  <  4>(x).  It  is  possible  to 
construct  (perhaps  pathological)  counter  examples  where  this 
desired  inequality  fails  to  hold.  When  calculating  the  descent 
direction  h ,  bilinear  spline  interpolation  is  used. 

C.  Proposed  Algorithm 

A  direct  execution  of  the  waypoint  feedback  policy  in  (4), 
denoted  as  ICPS_MDP  algorithm,  results  in  a  rubber-band 
like  contraction  of  the  waypoints  wrapping  around  obstacles, 
with  “attractor  springs”  pulling  waypoints  to  the  informative 
regions.  To  avoid  local  minima,  obstacles  are  ignored  during 
the  computation  of  an  initial  informative  coverage  path.  The 
algorithm  considers  obstacles  in  later  planning  stages  during 
which  it  repairs  the  portions  of  the  path  that  are  invalidated 
by  the  obstacles,  and  optimizes  the  path  with  respect  to 
the  environmental  effects  that  cause  uncertainty  in  robot’s 
motion. 


Algorithm  1  COMPUTElNFORMATIVEPATH(Ad,  OA) 
Require:  A  discrete  grid  Xd,  an  obstacle  region  O  C  X,j,  and  a 
sensor  function  <j>. 

Ensure:  An  informative  coverage  path  r. 

1:  Run  informative  path  shaping  algorithm  ICPS(A7i,  (j>)  [5]  to 
compute  an  informative  coverage  path  r  that  ignores  obstacles 
O. 

2:  Run  FindCollisionFreeWaypoints(t,£>)  to  find  collision- 
free  neighboring  waypoints  of  the  waypoints  that  are  inside  O. 

3:  Run  the  PBVI  algorithm  [13]  to  compute  (<f>Pi ,  nPi)  of  each 
waypoint  p-i  £  r.  Update  the  informative  coverage  path  r  be¬ 
tween  pi  and  its  predecessor  p;_i  and  add  additional  waypoints 
along  the  mode  path  connecting  p;  and  p,  _  i . 

4:  Run  ELIMINATE  WAYPOINTS  (r,  (/>)  to  remove  redundant  (too 
close  to  other  waypoints)  or  uninteresting  waypoints  (cover  a 
region  of  the  environment  where  little  information  is  present). 

5:  Run  ICPS_MDP(t,  (j>)  algorithm  to  smooth  the  informative 
coverage  path. 

6:  Run  EliminateWaypoints(t,  (j>)  to  remove  redundant  or 
uninteresting  waypoints. 

7:  Run  ICPS_MDP(t,  (j>)  algorithm  to  smooth  the  informative 
coverage  path. 

8:  return  t. 


Algorithm  2  ICPS_MDP(r,  <t>) 

Require:  An  informative  coverage  path  r  and  a  sensor  function  <j>. 
Ensure:  An  informative  coverage  path  r. 

1:  while  ||u||  >  e  do 
2:  for  all  Pi  £  t  do 

3:  Compute  Voronoi-like  partition  Vi  using  k-nearest- 

neighbor  (KNN)  algorithm  for  pi. 

4:  Run  PBVI  to  generate  (<f>Pi ,  nPi). 

5:  Integrate  the  system  dynamics  of  p;  =  m  =  f(pi)  using 

(4)  and  the  Euler  approximation  [13],  i.e.  Pk+i,i  =  Pk,i  + 

Uk,i  '  Affc 

6:  end  for 

7:  k  ■£-  k  +  1 

8:  end  while 
9:  return  r 


Coverage  plan  r  is  executed  by  starting  at  one  waypoint 
Pi-i,  and  using  the  feedback  plan  7rPi.  Once  the  vehicle 
reaches  the  goal  state  of  p, ,  then  the  planner  follows  the  next 
waypoint  feedback  plan  7tPi+1.  It  is  important  to  note  that  for 
Alg.  4,  waypoint  statistics  are  correlated  with  their  neighbors, 
so  eliminating  too  many  waypoints  at  once  may  cause  entire 
sections  of  the  path  to  disappear.  We  also  typically  assume 
that  q  =  r. 

D.  Calculating  Plan  Execution  Success  Probabilities 

Calculating  the  probability  of  collision  while  executing 
coverage  plan  r  is  a  nontrivial  task,  because  all  possible 
trajectories  leaving  from  one  waypoint  p,  to  get  to  the  next 
waypoint  Pi+i  must  be  considered.  Given  the  MDP  and  its 
solution,  we  define  a  Markov  Chain  MC  =  (X,i,  7’)  on  X,i 
with  transition  probabilities  given  by  0(:r,  7r(a:)),  defining 
state  transition  matrix  T  =  [Tjj],  where  TtJ  =  Pr[iEfc+1  = 
j\xk  =  i\-  We  further  assume  that  0  satisfies  the  property 
that  if  Xi  £  O  U  Xg0ai ,  then  T):?-  =  <5,j,  where  5tj  is  the 
Kronecker  delta  function.  In  other  words,  vehicles  that  enter 
goal  or  obstacle  locations  are  forever  trapped.  The  stationary 


Algorithm  3  FindCollisionFree  Waypoints  (r,  O) 

Require:  An  informative  coverage  path  r,  and  an  obstacle  region 
O  C  Xd. 

Ensure:  An  informative  coverage  path  t. 

1:  Compute  Od  by  dilating  O  using  the  convolution  mask  1.3x3, 
or  a  3  x  3  matrix  of  ones. 

2:  for  all  p,  e  t  do 

3:  if  pi  £  Od  then 

4:  Eliminate  pi  if  the  local  direction  of  r  does  not  change 

more  than  5  degrees,  i.e.,  if  |#,j  <  5°.  Here  cos#,  = 

\\lllil\m~l\\  and  =  Pi~  Pj  is  the  vector 

difference  of  pj  and  p;. 

5:  else 

6:  if  3  an  adjacent  waypoint  pj  £L  Od  then 

7:  Apply  a  variant  of  the  bisection  numerical  method  to 

push  pi  along  the  line  segment  pXPj  the  minimum  dis¬ 
tance  until  p-i  £L  Od,  or  the  length  of  the  segment  being 
divided  decreases  below  a  user-specified  threshold. 

8:  else 

9:  Search  for  a  waypoint  pj  ^  Od  along  the  line  seg¬ 

ments  pi, pi-i  and  pi,pi+\  using  the  Van  Der  Corput 
sequence  [13]  until  the  first  pj  ^  Od  is  found  or 
the  mesh  of  points  searched  decreases  below  a  user- 
specified  threshold. 

10:  end  if 

11:  if  pj  qL  Od  is  not  found  then 

12:  Eliminate  pi. 

13:  end  if 

14:  end  if 

15:  end  for 

16:  return  t. 


Algorithm  4  Eliminate  Waypoints  (r,  </>) 

Require:  An  informative  coverage  path  r  and  a  sensor  function  </>. 

Ensure:  An  informative  coverage  path  t. 

1:  Calculate  mass  and  area  of  each  waypoint’s  pi  £  r  Voronoi 
partition  [13]  with  respect  to  the  sensor  function  </>. 

2:  Select  q,r  £  {0, 1,  2, . . . ,  |r|  -  1}. 

3:  Sort  the  waypoints  of  the  informative  coverage  path  r  in 
ascending  order  with  respect  to  their  associated  mass  and  area. 

4:  Keep  any  waypoint  that  exceeds  the  qth  order  mass  statistic  and 
the  rth  order  area  statistic. 

5:  return  t. 


distribution  Y  of  T  (satisfying  Y  =  YT)  can  be  used  to 
calculate  the  probability  of  success  of  a  coverage  plan  being 
executed.  Y  is  a  probability  mass  function  in  the  belief  space 
of  Xd,  i.e.,  yW  =  Pr[X  =  Xi\  \/x i  £  Xd-  The  event  {xi  £ 
O}  denotes  that  a  collision  occurred,  while  the  event  { £ 
Xgoai}  denotes  a  successful  arrival  at  the  goal  waypoint. 
Given  Y,  the  probabilities  of  these  events  can  be  calculated. 
We  approximate  Y  numerically  by  executing  the  fixed  point 
method  Yfc+i  =  Y^T  until  it  converges  (T  is  sparse).  Note 
that  Y  exists  but  is  not  necessarily  unique.  However,  we  are 

most  concerned  with  the  limiting  distribution  Y  when  Yo  is 

(i) 

a  delta  distribution  centered  at  the  previous  waypoint.  Psucc 
is  the  probability  of  reaching  pi+1  from  p,  without  collision 
and  PSucc  is  the  probability  of  successfully  executing  the 
coverage  plan  r,  i.e.,  PSUcc  =  nh  Psucc • 


III.  Experimental  Setup 

A  set  of  three  distinct  environments  were  created:  “Gaus¬ 
sian  bump  grid”  £\  (see  Fig.  2  and  Fig.  4),  “trail  fork”  £2 
(see  Fig.  5),  and  “barricaded  hot  spots”  £3  (see  Fig.  5). 
These  environments  were  augmented  by  inclusion  of  a  wind 
gust  occupancy  grid  £  =  £  x  m  =  (</>,  0, 111)  [18],  where 
m  is  a  collection  of  Bernoulli  random  variables  that  define 
the  probability  of  a  gust  of  wind  occurring  from  the  north 
(in  the  —x  direction)  in  a  particular  cell.  In  other  words, 
Xi  €  Xj  is  assigned  a  particular  m*  e  m.  We  assume 
that  Pr[0fc|xjfc,Ufc,mfc]  can  be  more  readily  defined,  and 
can  be  marginalized  out  yielding  ©(xjt  ,  u*)  using  the  law  of 
total  probability  (the  distribution  of  111  is  given).  This  allows 
us  to  localize  the  locations  of  high  probability  wind  gusts 
to  particular  regions  in  the  environment,  highlighting  that 
this  induces  local  effects  in  the  informative  plan.  For  each 
environment,  £a,  a  e  {1, 2, 3},  two  different  occupancy  grids 
mo  and  ma  were  used,  yielding  a  total  of  6  simulations 
using  augmented  environments.  In  other  words,  the  6  envi¬ 
ronments  tested  were  {(£a,mo)}u{(£a,ina)}.  Here,  mo  is 
a  uniform  low  probability  wind  gust  occupancy  grid,  where 
Pr[m,  =  1]  =  0.1  Vi,  and  ma  was  tailored  for  its  respective 
environment  £a  to  have  high  probability  wind  gust  regions 
properly  located  to  attempt  to  blow  the  vehicle  into  a  given 
obstacle. 

To  ease  visualization  purposes,  Prfnii  =  1]  is  constrained 
to  be  either  0.1  or  1.  The  locations  where  Pr[mi  =  1]  =  1  are 
denoted  by  a  green  downward  pointing  triangle  to  indicate 
the  direction  the  wind  is  blowing. 

For  all  simulation  runs,  the  descent  direction  h  =  hmode 
was  selected  (5).  Apart  from  the  environment,  simulation 
parameters  for  all  runs  were  identical,  including  the  ini¬ 
tial  waypoint  selection  in  Fig.  2(a).  Obstacles  were  con¬ 
structed  such  that  even  after  obstacle  dilation,  the  free  space 
X o,free  =  Xd  \  Od  remains  fully  connected. 

IV.  Simulation  Results 

After  executing  the  proposed  algorithm,  the  resulting 
locations  of  the  waypoints  for  the  environments  “Gaussian 
bump  grid”  is  shown  in  Fig.  4,  while  both  “trail  fork”  and 
“barricaded  hot  spots”  are  in  Fig.  5.  The  coordinate  frame 
follows  the  North-East-Down  (NED)  convention. 

One  important  performance  metric  previously  unspecified 
in  the  informative  path  planning  literature  is  the  probability 
of  path  execution  success  (conversely,  probability  of  obstacle 
collision  during  path  execution)  while  executing  the  feedback 
plans  to  traverse  between  waypoints.  For  each  simulation,  we 
compare  the  vehicle’s  probability  of  success  while  executing 
the  specified  coverage  plan.  We  compare  the  waypoints 
selected  using  the  original  algorithm  ICPS  proposed  in  [5] 
to  the  ICPS_MDP  algorithm.  Note  that  ICPS  is  a  part  of 
ICPS_MDP,  so  the  coverage  plans  generated  by  the  original 
algorithm  are  shown  in  Figs.  4(a),  5(a),  and  5(b).  Since  the 
original  ICPS  algorithm  has  no  knowledge  of  obstacles  in 
the  environment,  waypoints  inside  obstacles  are  eliminated, 
and  MDP-based  feedback  plans  are  generated  to  steer  the 
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(a)  Waypoint  locations  after  exe-  (b)  Wind  blowing  from  north  to 
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Fig.  4.  “Gaussian  bump  grid”  environment  simulation  results.  This  is  the 
same  environment  as  shown  in  Fig.  2,  contrasting  the  ICPS  and  ICPS_MDP 
simulation  results.  Notice  here  that  the  north  wind  present  in  the  bottom 
right  in  4(b)  causes  the  vehicle  to  move  waypoints  upwind  to  reduce  the 
probability  of  collision. 


vehicle  around  obstacles  when  computing  plan  execution 
success  probabilities. 

To  test  the  robustness  of  the  coverage  plan,  we  also 
incorporate  uncertainty  into  the  planner’s  estimate  of  © 
so  there  is  a  mismatch  between  the  environment  and  the 
planner’s  model  of  the  environment.  We  define  a  new  wind 
occupancy  grid  whose  parameters  are  a  mixture  distribution 
of  the  true  occupancy  grid,  and  uniform  noise:  Pr[?n,  =  1]  = 
aPr[mi  =  1]  4-  (1  —  a)£,  where  Q  is  a  continuous  R.V.  s.t. 
Ci  ~  Unif(0, 1)  and  we  set  a  =  0.5. 

Twenty  wind  occupancy  grids  m  for  each  simulation  result 
were  drawn  from  Co  and  success  probability  statistics  were 
calculated  in  Table  I. 


Trial 

wind 

1 )  Gauss. 

2)  Trail 

3)  Hot 

ICPS  [5]  |PSUcc.l 

high 

88% 

89% 

95% 

ICPS  [5]  [a] 

high 

0.027 

0.019 

0.016 

ICPS_MDP  [Psucc.] 

high 

99.9% 

94% 

98% 

ICPS.MDP  [<r] 

high 

0.0013 

0.019 

0.0043 

ICP  [5]  [PSUcc.] 

low 

-100% 

-100% 

99.6% 

ICP  [5]  [crj 

low 

10-9 

10-® 

0.014 

ICPSJ4DP  [PSUCc.l 

low 

-100% 

-100% 

-100% 

ICPS_MDP  [<7] 

low 

10-15 

10-9 

10-13 

TABLE  I 

Plan  execution  success  probabilities  for  the  environments. 

V.  Conclusions 

In  this  paper,  we  have  introduced  a  planning  algorithm 
for  computation  of  informative  coverage  plans  for  persistent 
monitoring  tasks  with  static  obstacles.  The  algorithm  morphs 
an  initial  informative  coverage  path  towards  regions  with 
high  information  value,  while  avoiding  obstacles  and  explic¬ 
itly  considering  environmental  effects  that  cause  uncertainty 
in  a  vehicle’s  motion.  Simulation  results  show  that  the 
proposed  algorithm  is  robust  to  motion  uncertainties  and 
hence  reduces  the  probability  of  collision  with  obstacles  in 
the  environment. 

In  future  work,  positioning  uncertainty  stemming  from 
sensor  inaccuracies  can  be  modeled  by  using  a  Partially 
Observable  MDP  [18].  For  this  work,  the  sensor  function 
has  been  given  a  priori  as  a  part  of  the  environment  The 
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(a)  “Trail  fork”:  waypoint  ioca-  (b)  “Hot  spots”:  waypoint  loca¬ 
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(c)  “Trail  fork”:  low  occurrence  of  (d)  “Hot  spots”:  low  occurrence  of 
wind.  wind. 
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(e)  ‘Trail  fork”:  wind  blowing  (0  “Hot  spots”:  wind  blowing 
from  north  to  south.  from  north  to  south. 


Fig.  5.  For  “TVail  fork”  simulation  results:  the  waypoints  near  the  wind 
field  in  the  top  right  comer  seems  to  work  correctly  and  are  pushed  to  the 
wind  gust  boundary.  Otherwise,  wind  effects  seem  to  be  negligible  in  this 
environment.  However,  the  planner  does  not  seem  to  have  larger  clearances 
for  the  lower  obstacles.  For  “Barricaded  hot  spots”  simulation  results:  notice 
what  appears  to  be  an  extraneous  traversal  into  the  region  in  the  top  right  of 
both  figures.  This  could  be  a  consequence  of  the  relative  weighting  Ws,  Wg 
of  costs  in  (3).  A  further  improvement  of  waypoint  elimination  heuristics 
could  reduce  the  number  of  waypoints  in  the  bottleneck. 


characterization  of  relevant  sensor  functions  using  onboard, 
possibly  noisy  sensors  selected  for  a  particular  mission 
deserves  further  study. 
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